online platform
Toxicity in Online Platforms and AI Systems: A Survey of Needs, Challenges, Mitigations, and Future Directions
Khapre, Smita, Mersha, Melkamu Abay, Shakil, Hassan, Baruah, Jonali, Kalita, Jugal
The evolution of digital communication systems and the designs of online platforms have inadvertently facilitated the subconscious propagation of toxic behavior, giving rise to reactive responses to it. Toxicity in online content and Artificial Intelligence Systems has become a serious challenge to individual and collective well-being around the world. It is more detrimental to society than we realize. Toxicity, expressed in language, image, and video, can be interpreted in various ways depending on the context of usage. Therefore, a comprehensive taxonomy is crucial to detect and mitigate toxicity in online content, Artificial Intelligence systems, and/or Large Language Models in a proactive manner. A comprehensive understanding of toxicity is likely to facilitate the design of practical solutions for toxicity detection and mitigation. The classification in published literature has focused on only a limited number of aspects of this very complex issue, with a pattern of reactive strategies in response to toxicity. This survey attempts to generate a comprehensive taxonomy of toxicity from various perspectives. It presents a holistic approach to explaining toxicity by understanding the context and environment that society is facing in the Artificial Intelligence era. This survey summarizes the toxicity-related datasets and research on toxicity detection and mitigation for Large Language Models, social media platforms, and other online platforms, detailing their attributes in textual mode, focused on the English language. Finally, we identify research gaps in toxicity mitigation based on datasets, mitigation strategies, Large Language Models, adaptability, explainability, and evaluation.
- North America > Mexico > Mexico City > Mexico City (0.04)
- Europe > Poland > Łódź Province > Łódź (0.04)
- Europe > Norway > Eastern Norway > Oslo (0.04)
- Overview (1.00)
- Research Report > Promising Solution (0.45)
- Media > News (1.00)
- Leisure & Entertainment (1.00)
- Law (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
- Information Technology > Artificial Intelligence > Issues > Social & Ethical Issues (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.46)
YouTube should not be exempt from Australia's under-16s social media ban, eSafety commissioner says
YouTube should be included in the ban on under-16s accessing social media, the nation's online safety chief has said as she urges the Albanese government to rethink its decision to carve out the video sharing platform from new rules which apply to apps such as TikTok, Snapchat and Instagram. The eSafety commissioner, Julie Inman Grant, also recommended the government update its under-16s social media ban to specifically address features such as stories, streaks and AI chatbots which can disproportionately pose risk to young people. The under-16s ban will come into effect in December 2025, despite questions over how designated online platforms would verify users' ages, and the government's own age assurance trial reporting last week that current technology is not "guaranteed to be effective" and face-scanning tools have given incorrect results. Although then-communications minister Michelle Rowland initially indicated YouTube would be part of the ban legislated in December 2024, the regulations specifically exempted the Google-owned video site. Guardian Australia revealed YouTube's global chief executive personally lobbied Rowland for an exemption shortly before she announced the carve-out.
- Oceania > Australia (0.65)
- Oceania > New Zealand (0.05)
- Government (1.00)
- Law Enforcement & Public Safety > Crime Prevention & Enforcement (0.64)
- Information Technology > Security & Privacy (0.64)
- Law > Statutes (0.51)
- Information Technology > Communications > Social Media (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.56)
Multi-language Video Subtitle Dataset for Image-based Text Recognition
Singkhornart, Thanadol, Surinta, Olarik
The Multi-language Video Subtitle Dataset is a comprehensive collection designed to support research in text recognition across multiple languages. This dataset includes 4,224 subtitle images extracted from 24 videos sourced from online platforms. It features a wide variety of characters, including Thai consonants, vowels, tone marks, punctuation marks, numerals, Roman characters, and Arabic numerals. With 157 unique characters, the dataset provides a resource for addressing challenges in text recognition within complex backgrounds. It addresses the growing need for high-quality, multilingual text recognition data, particularly as videos with embedded subtitles become increasingly dominant on platforms like YouTube and Facebook. The variability in text length, font, and placement within these images adds complexity, offering a valuable resource for developing and evaluating deep learning models. The dataset facilitates accurate text transcription from video content while providing a foundation for improving computational efficiency in text recognition systems. As a result, it holds significant potential to drive advancements in research and innovation across various computer science disciplines, including artificial intelligence, deep learning, computer vision, and pattern recognition.
Lecture II: Communicative Justice and the Distribution of Attention
Algorithmic intermediaries govern the digital public sphere through their architectures, amplification algorithms, and moderation practices. In doing so, they shape public communication and distribute attention in ways that were previously infeasible with such subtlety, speed and scale. From misinformation and affective polarisation to hate speech and radicalisation, the many pathologies of the digital public sphere attest that they could do so better. But what ideals should they aim at? Political philosophy should be able to help, but existing theories typically assume that a healthy public sphere will spontaneously emerge if only we get the boundaries of free expression right. They offer little guidance on how to intentionally constitute the digital public sphere. In addition to these theories focused on expression, we need a further theory of communicative justice, targeted specifically at the algorithmic intermediaries that shape communication and distribute attention. This lecture argues that political philosophy urgently owes an account of how to govern communication in the digital public sphere, and introduces and defends a democratic egalitarian theory of communicative justice.
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.28)
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.14)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Media > News (1.00)
- Information Technology > Services (1.00)
- Law > Civil Rights & Constitutional Law (0.93)
Website visits can predict angler presence using machine learning
Schmid, Julia S., Simmons, Sean, Lewis, Mark A., Poesch, Mark S., Ramazi, Pouria
Understanding and predicting recreational fishing activity is important for sustainable fisheries management. However, traditional methods of measuring fishing pressure, such as surveys, can be costly and limited in both time and spatial extent. Predictive models that relate fishing activity to environmental or economic factors typically rely on historical data, which often restricts their spatial applicability due to data scarcity. In this study, high-resolution angler-generated data from an online platform and easily accessible auxiliary data were tested to predict daily boat presence and aerial counts of boats at almost 200 lakes over five years in Ontario, Canada. Lake-information website visits alone enabled predicting daily angler boat presence with 78% accuracy. While incorporating additional environmental, socio-ecological, weather and angler-generated features into machine learning models did not markedly improve prediction performance for boat presence, these features were important for the prediction of boat counts. Models achieved an $R^2$ of up to 0.77 at known lakes included in the model training, but they performed poorly for unknown lakes ($R^2 = 0.21$). The results demonstrate the value of integrating angler-generated data from online platforms into predictive models and highlight the potential of machine learning models to enhance fisheries management.
- North America > Canada > British Columbia > Vancouver Island > Capital Regional District > Victoria (0.04)
- North America > Canada > Alberta > Census Division No. 11 > Edmonton Metropolitan Region > Edmonton (0.04)
- Oceania > Australia > Western Australia > Perth (0.04)
ARTAI: An Evaluation Platform to Assess Societal Risk of Recommender Algorithms
Ruan, Qin, Xu, Jin, Dong, Ruihai, Younus, Arjumand, Mai, Tai Tan, O'Sullivan, Barry, Leavy, Susan
Societal risk emanating from how recommender algorithms disseminate content online is now well documented. Emergent regulation aims to mitigate this risk through ethical audits and enabling new research on the social impact of algorithms. However, there is currently a need for tools and methods that enable such evaluation. This paper presents ARTAI, an evaluation environment that enables large-scale assessments of recommender algorithms to identify harmful patterns in how content is distributed online and enables the implementation of new regulatory requirements for increased transparency in recommender systems.
- Europe > Ireland > Leinster > County Dublin > Dublin (0.06)
- Europe > Italy > Apulia > Bari (0.05)
- North America > United States > New York > New York County > New York City (0.04)
- Government (0.68)
- Law (0.68)
California is racing to combat deepfakes ahead of the election
Days after Vice President Kamala Harris launched her presidential bid, a video -- created with the help of artificial intelligence -- went viral. "I ... am your Democrat candidate for president because Joe Biden finally exposed his senility at the debate," a voice that sounded like Harris' said in the fake audio track used to alter one of her campaign ads. "I was selected because I am the ultimate diversity hire." Billionaire Elon Musk -- who has endorsed Harris' Republican opponent, former President Trump -- shared the video on X, then clarified two days later that it was actually meant as a parody. His initial tweet had 136 million views.
- North America > United States > California (0.47)
- North America > United States > Virginia (0.05)
- North America > United States > Oregon (0.05)
Bias Correction in Machine Learning-based Classification of Rare Events
Gubbels, Luuk, Puts, Marco, Daas, Piet
Online platform businesses can be identified by using web-scraped texts. This is a classification problem that combines elements of natural language processing and rare event detection. Because online platforms are rare, accurately identifying them with Machine Learning algorithms is challenging. Here, we describe the development of a Machine Learning-based text classification approach that reduces the number of false positives as much as possible. It greatly reduces the bias in the estimates obtained by using calibrated probabilities and ensembles.
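One way to see why calibrated probabilities can reduce bias when counting rare positives is sketched below. This is an illustration only, not the authors' pipeline: it assumes histogram-binning calibration on a labelled hold-out set, then estimates the number of positives by summing calibrated probabilities instead of counting thresholded predictions. The function names (`fit_bins`, `calibrate`, `estimated_count`) and the toy scores are hypothetical.

```python
def fit_bins(scores, labels, n_bins=5):
    """Histogram binning: map each score bin to its observed positive rate.

    scores are raw classifier outputs in [0, 1]; labels are 0 or 1.
    Empty bins fall back to the bin midpoint (an assumption of this sketch).
    """
    sums = [0.0] * n_bins
    counts = [0] * n_bins
    for s, y in zip(scores, labels):
        b = min(int(s * n_bins), n_bins - 1)
        sums[b] += y
        counts[b] += 1
    return [sums[b] / counts[b] if counts[b] else (b + 0.5) / n_bins
            for b in range(n_bins)]


def calibrate(score, bin_rates):
    """Replace a raw score with the positive rate of the bin it falls in."""
    n_bins = len(bin_rates)
    return bin_rates[min(int(score * n_bins), n_bins - 1)]


def estimated_count(scores, bin_rates):
    """Estimate the number of positives as the sum of calibrated probabilities."""
    return sum(calibrate(s, bin_rates) for s in scores)


# Toy hold-out set: high raw scores are right only half of the time.
holdout_scores = [0.1, 0.1, 0.1, 0.1, 0.9, 0.9]
holdout_labels = [0, 0, 0, 0, 1, 0]
bin_rates = fit_bins(holdout_scores, holdout_labels, n_bins=2)

new_scores = [0.9, 0.9, 0.1]
naive_count = sum(1 for s in new_scores if s >= 0.5)
calibrated_count = estimated_count(new_scores, bin_rates)
```

In this toy setting the thresholded count treats both high-scoring items as positives, while the calibrated sum discounts them by the false-positive rate observed on the hold-out set. An ensemble, as the abstract mentions, would average such estimates over several models.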
- North America > United States > New Jersey (0.05)
- Europe > Netherlands > North Brabant > Eindhoven (0.05)
- Europe > France > Île-de-France > Paris > Paris (0.05)
LAUSD shelves its hyped AI chatbot to help students after collapse of firm that made it
The school district said it dropped its dealings with AllHere, the company that created "Ed," the sun-shaped chatbot, after "we were notified of their financial collapse." AllHere did not respond to an inquiry this week from The Times and the level of its operation is unclear. In a separate development, a major data breach has affected a data cloud company called Snowflake, which has worked with L.A. Unified. The district said Tuesday that there is no connection to the AllHere situation, and that it is working with investigative agencies to assess the damage and which district records were obtained through a third-party contractor. Meanwhile, the district unplugged the chatbot -- for which AllHere had been paid $3 million -- on June 14, less than three months after unveiling the animated figure as an easy-to-use, conversational companion for students and a soon-to-be-indispensable guide for parents.
- North America > United States > California > Los Angeles County > Los Angeles (0.05)
- North America > United States > Colorado > Boulder County > Boulder (0.05)
- Information Technology > Security & Privacy (1.00)
- Education (1.00)
Adaptively Learning to Select-Rank in Online Platforms
Wang, Jingyuan, Dong, Perry, Jin, Ying, Zhan, Ruohan, Zhou, Zhengyuan
Ranking algorithms are fundamental to various online platforms, from e-commerce sites to content streaming services. Our research addresses the challenge of adaptively ranking items from a candidate pool for heterogeneous users, a key component in personalizing user experience. We develop a user response model that considers diverse user preferences and the varying effects of item positions, aiming to optimize overall user satisfaction with the ranked list. We frame this problem within a contextual bandits framework, with each ranked list as an action. Our approach incorporates an upper confidence bound to adjust predicted user satisfaction scores and selects the ranking action that maximizes these adjusted scores, efficiently solved via maximum weight imperfect matching. We demonstrate that our algorithm achieves a cumulative regret bound of $O(d\sqrt{NKT})$ for ranking $K$ out of $N$ items in a $d$-dimensional context space over $T$ rounds, under the assumption that user responses follow a generalized linear model. This regret alleviates dependence on the ambient action space, whose cardinality grows exponentially with $N$ and $K$ (thus rendering direct application of existing adaptive learning algorithms -- such as UCB or Thompson sampling -- infeasible). Experiments conducted on both simulated and real-world datasets demonstrate our algorithm outperforms the baseline.
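The select-rank step described in this abstract can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the paper's algorithm: the exploration bonus is a plain count-based term `alpha / sqrt(1 + pulls)` rather than a confidence width derived from a generalized linear model, and the ranking is chosen by brute-force enumeration, which stands in for a maximum weight matching solver and is only practical for very small $N$ and $K$. The names `optimistic_scores`, `select_ranking`, `pred`, `pulls`, and `alpha` are hypothetical.

```python
from itertools import permutations
from math import sqrt


def optimistic_scores(pred, pulls, alpha=1.0):
    """Add an exploration bonus to each predicted (item, position) score.

    pred[i][k]  : predicted satisfaction if item i is shown at position k
    pulls[i][k] : how many times item i has been shown at position k
    The bonus form is an assumption of this sketch.
    """
    return [[p + alpha / sqrt(1 + n) for p, n in zip(p_row, n_row)]
            for p_row, n_row in zip(pred, pulls)]


def select_ranking(scores, k):
    """Pick the ordered list of k distinct items with the highest total score.

    Brute force over all ordered k-subsets; a matching solver would be
    used instead at realistic sizes.
    """
    best, best_val = None, float("-inf")
    for perm in permutations(range(len(scores)), k):
        val = sum(scores[item][pos] for pos, item in enumerate(perm))
        if val > best_val:
            best, best_val = perm, val
    return list(best), best_val


# Three items, two positions. Item 2 looks poor but has never been shown.
pred = [[0.9, 0.5], [0.8, 0.7], [0.1, 0.1]]
pulls = [[100, 100], [100, 100], [0, 0]]

greedy_ranking, _ = select_ranking(optimistic_scores(pred, pulls, alpha=0.0), k=2)
ucb_ranking, _ = select_ranking(optimistic_scores(pred, pulls, alpha=1.0), k=2)
```

With `alpha=0.0` the selection is purely greedy on predicted scores; with a positive `alpha` the unexplored item is pulled into the ranked list, which is the exploration behavior an upper confidence bound is meant to provide.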
- Europe > Austria > Vienna (0.14)
- North America > United States > New York (0.04)
- Asia > China > Hong Kong (0.04)
- Information Technology > Services (0.48)
- Health & Medicine > Pharmaceuticals & Biotechnology (0.34)
- Media > Television (0.34)